📝 Walkthrough
Reworks WhisperCPP backend transcription and resampling: replaces push_back with emplace_back and in-place initialization for segments/word timings, adds container reservations, and implements three resampling paths (early-exit, stride-based for exact multiples, and general interpolation) with tightened bounds and safety checks.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
🚥 Pre-merge checks | ❌ 3 failed checks (2 warnings, 1 inconclusive)
Looks good to me! 👍
Reviewed everything up to f901cf0 in 14 seconds.
- Reviewed 121 lines of code in 1 file
- Skipped 0 files when reviewing
- Skipped posting 0 draft comments
| if (source_rate % WHISPER_SAMPLE_RATE == 0) { | ||
| int stride = source_rate / WHISPER_SAMPLE_RATE; | ||
| output.resize(samples.size() / stride); | ||
| for (size_t i = 0; i < output.size(); ++i) { | ||
| output[i] = samples[i * stride]; | ||
| } | ||
| return output; | ||
| } |
Indentation inconsistent with surrounding code - missing 4 spaces
Path: sdk/runanywhere-commons/src/backends/whispercpp/whispercpp_backend.cpp
Line: 566:573
```suggestion
    if (source_rate % WHISPER_SAMPLE_RATE == 0) {
        int stride = source_rate / WHISPER_SAMPLE_RATE;
        output.resize(samples.size() / stride);
        for (size_t i = 0; i < output.size(); ++i) {
            output[i] = samples[i * stride];
        }
        return output;
    }
```
| const size_t output_size = static_cast<size_t>(samples.size() / step); | ||
|
|
||
| std::vector<float> output; | ||
|
|
Blank line with trailing whitespace after declaration
Path: sdk/runanywhere-commons/src/backends/whispercpp/whispercpp_backend.cpp
Line: 564:564
```suggestion
std::vector<float> output;
```
| const int n_segments = whisper_full_n_segments(ctx_); | ||
| std::string full_text; | ||
| full_text.reserve(n_segments * 64); |
Trailing whitespace after reserve call
Path: sdk/runanywhere-commons/src/backends/whispercpp/whispercpp_backend.cpp
Line: 277:277
```suggestion
full_text.reserve(n_segments * 64);
```
| while (output.size() < output_size) { | ||
| size_t idx0 = static_cast<size_t>(pos); | ||
| if (idx0 >= src_size) idx0 = src_size - 1; | ||
|
|
||
| std::vector<float> output(output_size); | ||
| size_t idx1 = (idx0 + 1 < src_size) ? idx0 + 1 : src_size - 1; | ||
|
|
||
| for (size_t i = 0; i < output_size; ++i) { | ||
| const double src_idx = i / ratio; | ||
| const size_t idx0 = static_cast<size_t>(src_idx); | ||
| const size_t idx1 = std::min(idx0 + 1, samples.size() - 1); | ||
| const double frac = src_idx - idx0; | ||
| double frac = pos - static_cast<double>(idx0); | ||
| float val0 = src_ptr[idx0]; | ||
| float val1 = src_ptr[idx1]; | ||
|
|
||
| output[i] = static_cast<float>(samples[idx0] * (1.0 - frac) + samples[idx1] * frac); | ||
| output.push_back(val0 + static_cast<float>(frac) * (val1 - val0)); | ||
| pos += step; | ||
| } |
Looping with `push_back` after the exact size has been calculated reduces the benefit of the vectorization optimization. Consider indexed writes into a pre-sized vector, since `output_size` is known upfront.
Path: sdk/runanywhere-commons/src/backends/whispercpp/whispercpp_backend.cpp
Line: 597:609
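For reference, a minimal sketch of the indexed-write variant the comment above asks for. This is not the PR's code: `resample_linear_indexed` and `kTargetRateHz` are hypothetical names, and the function assumes a mono `float` buffer and a positive `source_rate`. The output is sized once and filled by index, and each source position is computed as `i * step` rather than accumulated with `pos += step`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the backend's WHISPER_SAMPLE_RATE constant.
constexpr int kTargetRateHz = 16000;

// Linear-interpolation resampler that pre-sizes the output and writes by index.
std::vector<float> resample_linear_indexed(const std::vector<float>& samples,
                                           int source_rate) {
    assert(source_rate > 0);
    if (samples.empty() || source_rate == kTargetRateHz) {
        return samples;  // early exit: nothing to resample
    }

    const double step = static_cast<double>(source_rate) / kTargetRateHz;
    const std::size_t src_size = samples.size();
    const std::size_t output_size = static_cast<std::size_t>(src_size / step);

    std::vector<float> output(output_size);  // sized once, no regrowth
    for (std::size_t i = 0; i < output_size; ++i) {
        const double pos = static_cast<double>(i) * step;
        // Clamp both neighbours so the last samples never read out of range.
        const std::size_t idx0 = std::min(static_cast<std::size_t>(pos), src_size - 1);
        const std::size_t idx1 = std::min(idx0 + 1, src_size - 1);
        const float frac = static_cast<float>(pos - static_cast<double>(idx0));
        // Same interpolation form as in the hunk: val0 + frac * (val1 - val0).
        output[i] = samples[idx0] + frac * (samples[idx1] - samples[idx0]);
    }
    return output;
}
```

Whether this is measurably faster than the `reserve` plus `push_back` loop depends on the compiler and target, so it would need profiling before replacing the merged version.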
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@sdk/runanywhere-commons/src/backends/whispercpp/whispercpp_backend.cpp`:
- Around line 560-608: The resampling code can return an empty vector when
output_size computes to 0 for very small input chunks; clamp output_size to at
least 1 (e.g., size_t output_size = std::max(static_cast<size_t>(1),
static_cast<size_t>(samples.size() / step));) and mirror that clamping in the
integer-stride fast path (ensure stride_out_size = std::max<size_t>(1,
samples.size() / stride) and use that for output.resize and indexing), and
ensure subsequent loops use the clamped output_size/safe_output_limit so no
audio is dropped nor out-of-range indexing occurs (adjust pos/init or bounds
checks as needed).
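A rough sketch of the clamping described above, applied to the integer-stride path only. The names `decimate_by_stride` and `kWhisperRateHz` are hypothetical, and returning an empty vector for empty input or a non-multiple rate is an assumption made here so the example stays self-contained. The backend would presumably fall through to its interpolation path instead.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the backend's WHISPER_SAMPLE_RATE constant.
constexpr int kWhisperRateHz = 16000;

// Stride-based decimation for source rates that are exact multiples of the
// target rate, with the output size clamped to at least one sample.
std::vector<float> decimate_by_stride(const std::vector<float>& samples,
                                      int source_rate) {
    // Assumption: an empty result signals "not handled by this path".
    if (samples.empty() || source_rate <= 0 || source_rate % kWhisperRateHz != 0) {
        return {};
    }

    const std::size_t stride = static_cast<std::size_t>(source_rate / kWhisperRateHz);
    // A chunk shorter than one stride would otherwise produce zero samples.
    const std::size_t out_size = std::max<std::size_t>(1, samples.size() / stride);

    std::vector<float> output(out_size);
    for (std::size_t i = 0; i < out_size; ++i) {
        // In range: index 0 is valid for non-empty input, and for larger
        // outputs (out_size - 1) * stride <= samples.size() - stride.
        output[i] = samples[i * stride];
    }
    return output;
}
```

Note that the clamp is only safe because empty input is rejected first; clamping to 1 without that check would read `samples[0]` from an empty vector.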
Will fix the indentation and push soon.
We're not using whisper.cpp as of now for the voice use cases; those just use sherpa-onnx.
But will just merge, since it's fine to merge.
* Optimise resampling and heap thrashing opti
* improved even more for integer ratios
* Update whispercpp_backend.cpp
WatchFace:
- Time truly centered (Arrangement.Center), no more top-heavy layout
- AI status dot moved to top-center, battery to subtle top-right (8sp)
- MIC button: 36dp → 28dp, camera: 28dp → 22dp (less dominant)
- Bottom action row with spacedBy(16.dp) instead of Spacer hacks

TranscriptionScreen:
- Back button: dark background circle + arrow unicode (←), visible on all bezels
- Title centered via Box with Alignment.Center (not SpaceBetween)
- Divider inset extra 4dp on watch to avoid round edge
- Footer: entry count hidden on watch, "Clear all" centered, 8dp bottom padding

CameraOverlay:
- Close button: top-center on watch (not top-right corner), avoids bezel clip
- Brighter background (RunanywhereAI#333) for close button visibility
- Capture/? buttons: bottom padding 16dp → 24dp, well within bezel
- Preview: 120dp → 100dp, more breathing room around edges

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Changes
- `full_text` and `result segments`
Results
Important
Optimizes resampling and memory management in `whispercpp_backend.cpp` for improved performance and reduced heap thrashing.
- Optimizes `resample_to_16khz()` in `whispercpp_backend.cpp` for vectorization, using `restrict` pointers and splitting loops to reduce checks.
- Reserves capacity for `full_text` and `result.segments` in `transcribe_internal()` to reduce heap thrashing.
This description was created by Ellipsis for f901cf0.
Summary by CodeRabbit
Greptile Overview
Greptile Summary
This PR optimizes the WhisperCPP backend for better performance through vectorization-friendly code and memory pre-allocation strategies.
Key improvements:
- Resampling uses `restrict` pointers and FMA-friendly interpolation (`val0 + frac * (val1 - val0)`)
- Pre-allocates the `full_text`, `segments`, and `word_timings` vectors to reduce heap thrashing
- Switches from `push_back` to `emplace_back` with a reference to avoid unnecessary copies
Minor issues found:
- The interpolation loop uses `push_back` after calculating the exact size, which could use indexed writes for better performance
Confidence Score: 4/5
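To illustrate the pre-allocation and `emplace_back` points in this summary, here is a small sketch of the pattern. It requires C++17, where `emplace_back` returns a reference to the new element. `Segment`, `Transcript`, and `collect_segments` are hypothetical types and names, not the backend's real ones, and the timestamps are placeholder values. The factor of 64 bytes per segment mirrors the `full_text.reserve(n_segments * 64)` call quoted in the review.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical segment type; the backend's actual struct is not shown here.
struct Segment {
    std::string text;
    std::int64_t start_ms = 0;
    std::int64_t end_ms = 0;
};

// Hypothetical result type holding the joined text and per-segment data.
struct Transcript {
    std::string full_text;
    std::vector<Segment> segments;
};

Transcript collect_segments(const std::vector<std::string>& texts) {
    Transcript result;
    // Reserve once so the loop below does not trigger repeated reallocations.
    result.full_text.reserve(texts.size() * 64);
    result.segments.reserve(texts.size());

    for (std::size_t i = 0; i < texts.size(); ++i) {
        // Construct the element in place, then fill it through the reference
        // instead of building a temporary and copying it in.
        Segment& seg = result.segments.emplace_back();
        seg.text = texts[i];
        seg.start_ms = static_cast<std::int64_t>(i) * 1000;  // placeholder timing
        seg.end_ms = seg.start_ms + 1000;                     // placeholder timing
        result.full_text += texts[i];
    }
    return result;
}
```

The reservation for `full_text` is only a guess at average segment length, so an over- or under-estimate changes memory use but not correctness.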
Important Files Changed
Sequence Diagram
sequenceDiagram
    participant Client
    participant WhisperCppSTT
    participant VectorMemory as Vector (Pre-allocated)
    participant ResampleFn as resample_to_16khz
    Client->>WhisperCppSTT: transcribe_internal(audio, config)
    Note over WhisperCppSTT: Process audio segments
    WhisperCppSTT->>VectorMemory: full_text.reserve(n_segments * 64)
    WhisperCppSTT->>VectorMemory: segments.reserve(n_segments)
    alt word_timestamps enabled
        WhisperCppSTT->>VectorMemory: word_timings.reserve(n_segments * 15)
    end
    loop for each segment
        WhisperCppSTT->>VectorMemory: segments.emplace_back()
        Note over WhisperCppSTT: Direct construction in place
        alt word_timestamps
            loop for each token
                WhisperCppSTT->>VectorMemory: word_timings.emplace_back()
            end
        end
    end
    Client->>WhisperCppSTT: needs resampling
    WhisperCppSTT->>ResampleFn: resample_to_16khz(samples, source_rate)
    alt integer ratio (e.g., 48kHz -> 16kHz)
        ResampleFn->>ResampleFn: Fast path: stride-based decimation
        ResampleFn-->>WhisperCppSTT: return output
    else non-integer ratio
        ResampleFn->>VectorMemory: output.reserve(output_size)
        ResampleFn->>ResampleFn: Fast loop with restrict pointer
        Note over ResampleFn: FMA-friendly interpolation:<br/>val0 + frac * (val1 - val0)
        ResampleFn->>ResampleFn: Safe loop for remaining samples
        ResampleFn-->>WhisperCppSTT: return output
    end
    WhisperCppSTT-->>Client: return STTResult